The Better Predictive Model: High q² for the Training Set or Low Root Mean Square Error of Prediction for the Test Set?
Abstract
Validation may be the most important step in the development of computational models such as QSARs. Different requirements for the reliability and predictivity of QSAR models have been described in the literature. Despite these formal recommendations, there are few practical rules as to when to cease adding variables to a QSAR (i.e., what is an appropriate level of model complexity). In this work the influence of model complexity on statistical fit and error has been investigated using toxicity data for 200 phenols to the ciliated protozoan Tetrahymena pyriformis, with a test set of a further 50 compounds. The results showed that several factors play a role in the definition of a good and reliable QSAR: q² is not a good criterion for model predictivity; outliers should not necessarily be deleted, as this may reduce the chemical space of the model; the number of descriptors in a multivariate model should be chosen carefully to avoid under- and over-estimation; and an appropriate number of dimensions is required for PLS modelling.
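The two statistics contrasted in the title can be illustrated with a minimal sketch. It assumes an ordinary least-squares model, leave-one-out cross-validation for q² (computed as 1 - PRESS/SS_total, with SS_total taken about the overall training mean), and synthetic data; none of this is the phenol data set or the authors' own procedure, and the helper names (`fit_ols`, `q2_loo`, `rmsep`) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (X is 2-D)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def q2_loo(X, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # drop compound i
        coef = fit_ols(X[keep], y[keep])      # refit without it
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def rmsep(y_true, y_pred):
    """Root mean square error of prediction on an external test set."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic illustration only: 10 descriptors, of which 2 carry signal.
rng = np.random.default_rng(0)
X_all = rng.normal(size=(250, 10))
y_all = 1.5 * X_all[:, 0] - 0.8 * X_all[:, 1] + rng.normal(scale=0.5, size=250)
X_train, y_train = X_all[:200], y_all[:200]   # 200 training compounds
X_test, y_test = X_all[200:], y_all[200:]     # 50 test compounds

# Add descriptors one at a time and report both statistics.
for k in range(1, 11):
    coef = fit_ols(X_train[:, :k], y_train)
    q2 = q2_loo(X_train[:, :k], y_train)
    err = rmsep(y_test, predict(coef, X_test[:, :k]))
    print(f"descriptors={k:2d}  q2={q2:.3f}  RMSEP={err:.3f}")
```

Tabulating both values against the number of descriptors is one simple way to see whether an increase in training-set q² is matched by a decrease in test-set RMSEP.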
Similar resources
Artificial Neural Network Modeling for Predicting of some Ion Concentrations in the Karaj River
The water quality of the Karaj River was studied using a set of 2137 experimental data points collected from 20 sampling stations. The data included parameters such as T (temperature), pH, NTU (turbidity), hardness, TDS (total dissolved solids), EC (electrical conductivity) and basic anion and cation concentrations. In this study a multi-layer perceptron artificial neural network model was d...
Global Solar Radiation Prediction for Makurdi, Nigeria Using Feed Forward Backward Propagation Neural Network
The optimum design of solar energy systems strongly depends on the accuracy of solar radiation data. However, the availability of accurate solar radiation data is undermined by the high cost of measuring equipment or non-functional ones. This study developed a feed-forward backpropagation artificial neural network model for prediction of global solar radiation in Makurdi, Nigeria (7.7322 N lo...
Quantitative Modeling for Prediction of Critical Temperature of Refrigerant Compounds
The quantitative structure-property relationship (QSPR) method is used to develop the correlation between structures of refrigerants (198 compounds) and their critical temperature. Molecular descriptors calculated from structure alone were used to represent molecular structures. A subset of the calculated descriptors selected using a genetic algorithm (GA) was used in the QSPR model development...
The CFD Provides Data for Adaptive Neuro-Fuzzy to Model the Heat Transfer in Flat and Discontinuous Fins
In the present study, the Adaptive Neuro–Fuzzy Inference System (ANFIS) approach was applied to predict the heat transfer and air flow pressure drop on flat and discontinuous fins. The heat transfer and friction characteristics were experimentally investigated in four flat and discontinuous fins with different geometric parameters, including fin length (r), fin interruption (s), fin pitch (p), ...
Bayesian prediction of rotational torque to operate horizontal drilling
Horizontal directional drilling is widely used in drilling engineering. In a variety of conditions, it is necessary to predict the torque required to perform the drilling operation. Nevertheless, there is currently no convenient method available to accomplish this task. To overcome this difficulty, the current work aims at predicting the required rotational torque (RT) to opera...